46 research outputs found

    Co-designing reliability and performance for datacenter memory

    Memory is one of the key components affecting the reliability and performance of datacenter servers. Memory in today’s servers is organized and shared in several ways to provide the most performant and efficient access to data: for example, cache hierarchies in multi-core chips reduce access latency, non-uniform memory access (NUMA) in multi-socket servers improves scalability, and disaggregation increases memory capacity. In all these organizations, hardware coherence protocols maintain the consistency of this shared memory and implicitly move data to the requesting cores. This thesis aims to provide fault tolerance against newer models of failure in the organization of memory in datacenter servers. While designing for improved reliability, the thesis explores solutions that can also enhance application performance. The solutions build on modern coherence protocols to achieve these properties. First, we observe that DRAM memory system failure rates have increased, demanding stronger forms of memory reliability. To combat this, the thesis proposes Dvé, a hardware-driven replication mechanism in which data blocks are replicated across two different memory controllers in a cache-coherent NUMA system. Data blocks are accompanied by a code with strong error detection capabilities so that, when an error is detected, correction is performed using the replica. Dvé’s organization offers two independent points of access to data, which enables (a) strong error correction that can recover from a range of faults affecting any of the components in the memory and (b) higher performance by providing another, nearer point of memory access. Dvé’s coherent replication keeps the replicas in sync for reliability and also provides coherent access to read replicas during fault-free operation for improved performance. Dvé can flexibly provide these benefits on demand at runtime.
Next, we observe that the coherence protocol itself needs to be hardened against failures. Memory in datacenter servers is being disaggregated from the compute servers into dedicated memory servers, driven by standards like CXL. CXL specifies the coherence protocol semantics for compute servers to access and cache data from a shared region in the disaggregated memory. However, the CXL specification lacks the level of fault tolerance necessary to operate at an inter-server scale within the datacenter. Compute servers in the datacenter can fail or become unresponsive; it is therefore important that the coherence protocol remain available in the presence of such failures. The thesis proposes Āpta, a CXL-based, shared disaggregated memory system that keeps cached data consistent without compromising availability in the face of compute server failures. Āpta architects a high-performance, fault-tolerant, object-granular memory server that significantly improves performance for stateless function-as-a-service (FaaS) datacenter applications.
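The detect-then-correct-from-replica idea behind Dvé, as the abstract describes it, can be illustrated with a toy software model. This is only a sketch under stated assumptions, not the thesis's hardware design: the class `ReplicatedMemory`, its methods, and the choice of CRC32 as the detection code are all hypothetical stand-ins, and the real mechanism lives in the coherence protocol and memory controllers.

```python
import zlib


class ReplicatedMemory:
    """Toy model: each block is kept on two 'controllers' with a CRC32 detection code."""

    def __init__(self):
        # Two independent dictionaries stand in for the two memory controllers.
        self.controllers = [{}, {}]

    def write(self, addr, data: bytes):
        # Keep both replicas in sync on every write.
        for store in self.controllers:
            store[addr] = (data, zlib.crc32(data))

    def read(self, addr, prefer=0):
        # Try the preferred (for example, nearer) controller first, then the other replica.
        for idx in (prefer, 1 - prefer):
            data, code = self.controllers[idx][addr]
            if zlib.crc32(data) == code:
                if idx != prefer:
                    # The preferred copy failed its check: repair it from the good replica.
                    self.controllers[prefer][addr] = (data, code)
                return data
        raise RuntimeError("both replicas failed the detection check")

    def corrupt(self, addr, controller):
        # Test helper (assumption): flip one bit without updating the stored code.
        data, code = self.controllers[controller][addr]
        self.controllers[controller][addr] = (bytes([data[0] ^ 1]) + data[1:], code)
```

In this model a read succeeds as long as at least one replica passes its check, and the `prefer` argument loosely mirrors the abstract's point that either copy can serve reads during fault-free operation.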

    Dvé: Improving DRAM reliability and performance on-demand via coherent replication


    Robust estimation of bacterial cell count from optical density

    Optical density (OD) is widely used to estimate the density of cells in liquid culture, but it cannot be compared between instruments without a standardized calibration protocol and is challenging to relate to actual cell count. We address this with an interlaboratory study comparing three simple, low-cost, and highly accessible OD calibration protocols across 244 laboratories, applied to eight strains of constitutive GFP-expressing E. coli. Based on our results, we recommend calibrating OD to estimated cell count using serial dilution of silica microspheres, which produces highly precise calibration (95.5% of residuals <1.2-fold), is easily assessed for quality control, also assesses instrument effective linear range, and can be combined with fluorescence calibration to obtain units of Molecules of Equivalent Fluorescein (MEFL) per cell, allowing direct comparison and data fusion with flow cytometry measurements. In our study, fluorescence per cell measurements showed only a 1.07-fold mean difference between plate reader and flow cytometry data.
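The arithmetic behind a serial-dilution calibration like the one recommended above can be sketched as follows. This is a minimal illustration, not the study's protocol: the function names, the two-fold dilution factor, and all numbers in the usage note are assumptions. It also assumes OD is proportional to particle count over the wells used, which holds only inside the instrument's effective linear range.

```python
def particles_per_od(initial_count, od_readings, blank_od, dilution_factor=2.0):
    """Estimate particles per OD unit from a serial dilution of microspheres.

    od_readings[i] is the raw OD at dilution step i (step 0 is undiluted).
    """
    ratios = []
    for step, od in enumerate(od_readings):
        net_od = od - blank_od  # subtract the background reading
        if net_od <= 0:
            continue  # skip wells at or below the blank
        count = initial_count / dilution_factor ** step
        ratios.append(count / net_od)
    if not ratios:
        raise ValueError("no wells above the blank")
    return sum(ratios) / len(ratios)  # mean particles per OD unit


def estimate_cell_count(sample_od, blank_od, scale):
    """Convert a blank-subtracted sample OD into an estimated cell count."""
    return (sample_od - blank_od) * scale
```

For example, with a hypothetical 3e8 particles in the first well, a blank of 0.04, and raw readings of 0.64, 0.34 and 0.19, each well yields about 5e8 particles per OD unit, so a sample reading of 0.24 corresponds to roughly 1e8 cells. Wells whose ratio drifts away from the others would indicate readings outside the linear range.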

    Simplified Model for Persistent Sliding Contact

    Numerous engineering applications involve one or more mechanical components that come into contact with each other during operation. Repetitive motions under contact can lead to fatigue and wear problems in the components. When designing such components, it is important to characterize and quantify the stresses and strains in the contacting elements. This can be achieved by numerical simulation of the process and by using one of the several contact formulations available in most finite element software programs. However, contact is an inherently non-linear problem which is rather challenging even for the best commercial software programs currently available. Often contact simulations are plagued by issues of high computational cost and non-convergence that are highly problem dependent. Further, modeling approaches that work for one scenario do not generalize easily for other problems. In several applications, one encounters sliding contact that is persistent. In such cases, components always stay in contact during operation but slide with respect to one another within a small range of motion. An example of such an application is the interlock hose where thin strips of sheet metal are coiled together in a way that adjacent coils lock with each other to form a flexible hose. This flexible hose allows a limited amount of motion between adjacent coils by letting the coils slide with respect to each other while always remaining in locked contact. In this study, a simplified model is developed for applications with persistent sliding contact. The simplified model utilizes slender spring and membrane elements that are stiff in the direction of their orientation but flexible in the transverse direction. The stiff response is used to simulate persistent contact and to prevent gaps or penetration between contacting components and the flexible response is used to create a bi-stable mechanism that mimics sliding between the components. 
The primary benefit of this approach is that it is far more computationally efficient than conventional approaches for modeling contact with high fidelity. However, given that it is a simplified model, one loses some accuracy in the solution, especially in regions of the model that are actually in contact. Nevertheless, this simplified approach and conventional high-fidelity contact models produce deformations and stresses that are very similar in parts of the model that are away from the immediate region of contact. Several numerical examples are presented to illustrate the simplified model and to compare its performance, both in terms of solution accuracy and computational cost, to conventional high-fidelity contact models.

    Effect of metal oxide supports on active-Cu for CO/CO2 hydrogenation to methanol

    Growing concern over global warming, discussions of a sustainable future, and a large imbalance in the closure of the carbon cycle call for efficient conversion of CO2 and syngas obtained from renewable sources. Thermochemical conversion of carbon oxides (CO and CO2) with hydrogen to produce methanol in the presence of a catalyst provides a pathway to close this carbon cycle. Steady-state activity tests were carried out in a small integral reactor for methanol synthesis from a mixture of either CO/H2 or CO2/H2. The temperature was varied from 200 to 300°C, while the total pressure was held constant at 85 bar for CO/H2 and at 60 bar for CO2/H2, keeping a stoichiometric flow of hydrogen at a GHSV of 24,000 hr⁻¹. Four metal oxides, namely ZnO, ZrO2, MgO and CeO2, were investigated for support effects on active Cu, along with different combinations among them, with a commercial catalyst as the benchmark. Catalysts were prepared using the urea hydrolysis method. ZrO2 and MgO were found to show higher selectivity; however, the latter does not exhibit conversion comparable to that of the commercial catalyst for CO2 hydrogenation. A detailed GHSV study of Cu-ZrO2 paints a completely different picture, showing higher methanol selectivity (64%) with increasing space velocity (at a GHSV of 32,000 hr⁻¹). In the case of CO hydrogenation, the commercial catalyst performs best, albeit displaying signs of carbon deposition at higher temperatures (280 and 300°C). This is circumvented by employing a ZnO/MgO combination as a support. Cu-CeO2 exhibited the characteristics of an excellent water-gas shift catalyst. This led to a novel mixed-bed configuration consisting of Cu-CeO2 with the commercial catalyst. Results indicate that this combination improves the methanol yield by at least 30% compared to the commercial catalyst at a high GHSV of 24,000 hr⁻¹. Chemical Engineerin

    HAShCache: Heterogeneity-Aware Shared DRAMCache for Integrated Heterogeneous Systems

    Integrated Heterogeneous System (IHS) processors pack throughput-oriented General-Purpose Graphics Processing Units (GPGPUs) alongside latency-oriented Central Processing Units (CPUs) on the same die, sharing certain resources, e.g., the last-level cache, the Network-on-Chip (NoC), and the main memory. The demands for memory accesses and other shared resources from GPU cores can exceed those of CPU cores by two to three orders of magnitude. This disparity poses significant problems in exploiting the full potential of these architectures. In this article, we propose adding a large-capacity stacked DRAM, used as a shared last-level cache, for IHS processors. However, adding the DRAMCache naively leaves significant performance on the table due to the disparate demands from CPU and GPU cores for DRAMCache and memory accesses. In particular, the imbalance can significantly reduce the performance benefits that the CPU cores would otherwise have enjoyed with the introduction of the DRAMCache, necessitating heterogeneity-aware management of this shared resource for improved performance. We propose three simple techniques to enhance the performance of CPU applications while ensuring little to no performance impact on the GPU. Specifically, we propose (i) PrIS, a prioritization scheme for scheduling CPU requests at the DRAMCache controller; (ii) ByE, a selective and temporal bypassing scheme for CPU requests at the DRAMCache; and (iii) Chaining, an occupancy-controlling mechanism for GPU lines in the DRAMCache through pseudo-associativity. The resulting cache, Heterogeneity-Aware Shared DRAMCache (HAShCache), can adapt dynamically to address the inherent disparity of demands in an IHS architecture. Experimental evaluation of the proposed HAShCache shows an average system performance improvement of 41% over a naive DRAMCache and over 200% improvement over a baseline system with no stacked DRAMCache.
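The general idea of prioritizing CPU requests at a shared controller, as PrIS is described above, can be illustrated with a toy queue. This is a hedged sketch only: the abstract does not give PrIS's actual scheduling policy, so the strict CPU-first ordering, the class `PriorityRequestQueue`, and its methods are all assumptions made for illustration.

```python
import heapq
import itertools


class PriorityRequestQueue:
    """Toy controller queue: CPU requests are served before GPU requests."""

    CPU, GPU = 0, 1  # lower value means higher priority

    def __init__(self):
        self._heap = []
        # Arrival counter breaks ties so each class stays in FIFO order.
        self._arrival = itertools.count()

    def enqueue(self, source, addr):
        heapq.heappush(self._heap, (source, next(self._arrival), addr))

    def dequeue(self):
        source, _, addr = heapq.heappop(self._heap)
        return source, addr

    def __len__(self):
        return len(self._heap)
```

A strict policy like this one could starve GPU requests under sustained CPU load, so a scheme that claims little to no GPU impact would presumably bound or condition the prioritization in some way that this sketch does not model.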